16 research outputs found

    AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models

    Social bias in Pretrained Language Models (PLMs) affects text generation and other downstream NLP tasks. Existing bias testing methods rely predominantly on manual templates or on expensive crowd-sourced data. We propose a novel AutoBiasTest method that automatically generates sentences for testing bias in PLMs, hence providing a flexible and low-cost alternative. Our approach uses another PLM for generation and controls the generation of sentences by conditioning on social group and attribute terms. We show that generated sentences are natural and similar to human-produced content in terms of word length and diversity. We illustrate that larger models used for generation produce estimates of social bias with lower variance. We find that our bias scores are well correlated with manual templates, but AutoBiasTest highlights biases not captured by these templates due to more diverse and realistic test sentences. By automating large-scale test sentence generation, we enable better estimation of underlying bias distribution.

    Calendar.help: Designing a Workflow-Based Scheduling Agent with Humans in the Loop

    Although information workers may complain about meetings, they are an essential part of their work life. Consequently, busy people spend a significant amount of time scheduling meetings. We present Calendar.help, a system that provides fast, efficient scheduling through structured workflows. Users interact with the system via email, delegating their scheduling needs to the system as if it were a human personal assistant. Common scheduling scenarios are broken down using well-defined workflows and completed as a series of microtasks that are automated when possible and executed by a human otherwise. Unusual scenarios fall back to a trained human assistant who executes them as unstructured macrotasks. We describe the iterative approach we used to develop Calendar.help, and share the lessons learned from scheduling thousands of meetings during a year of real-world deployments. Our findings provide insight into how complex information tasks can be broken down into repeatable components that can be executed efficiently to improve productivity. Comment: 10 pages

    Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

    Labeling social-media data for custom dimensions of toxicity and social bias is challenging and labor-intensive. Existing transfer and active learning approaches meant to reduce annotation effort require fine-tuning, which suffers from over-fitting to noise and can cause domain shift with small sample sizes. In this work, we propose a novel Active Transfer Few-shot Instructions (ATF) approach which requires no fine-tuning. ATF leverages the internal linguistic knowledge of pre-trained language models (PLMs) to facilitate the transfer of information from existing pre-labeled datasets (source-domain task) with minimum labeling effort on unlabeled target data (target-domain task). Our strategy can yield positive transfer, achieving a mean AUC gain of 10.5% compared to no transfer with a large 22B-parameter PLM. We further show that annotation of just a few target-domain samples via active learning can be beneficial for transfer, but the impact diminishes with more annotation effort (26% drop in gain between 100 and 2000 annotated examples). Finally, we find that not all transfer scenarios yield a positive gain, which seems related to the PLM's initial performance on the target-domain task. Comment: Accepted to NeurIPS Workshop on Transfer Learning for Natural Language Processing, 2022, New Orleans

    HarborBot: A Chatbot for Social Needs Screening.


    Designing Engaging Conversational Interactions for Health & Behavior Change

    Thesis (Ph.D.)--University of Washington, 2021
    The recent popularity of chat and voice-based conversational interactions, fueled by advances in natural language processing (NLP), has opened up opportunities for re-imagining user interactions in health & behavior change as conversational experiences. Prior work has indicated that a well-designed conversational approach can be more engaging, motivating, natural, personal, and understandable. It can also mimic the properties of some of the most successful human-led interventions, such as coaching and motivational interviewing. However, designing conversational interactions poses numerous challenges. Efficiently creating conversational content that is diverse, relevant for the context, and natural-sounding is challenging. Furthermore, balancing the still limited AI capabilities with user expectations requires careful problem scoping and other design considerations. Finally, the mechanisms by which a successful conversational interaction can help improve user engagement are still not well explored. In this dissertation I propose 4 different conversational systems that address some of the fundamental health & behavior change challenges. In Chapter 3, to address the intrinsic challenge of user boredom and engagement loss with repeated interactions, I propose a conversational system with value-based conversation topic personalization and diversification. In Chapter 4, to address the challenge of engaging users in mindful self-learning from their behavioral data, I propose conversational systems supporting structured reflection on physical activity and on professional development at work. In Chapter 5, to support health data collection, especially to improve user comfort in sensitive topics and understandability among low-literacy populations, I propose a system for conversational survey administration. Finally, in Chapter 6, to lower the effort involved in designing good-quality conversational systems, I propose a tool for automated conversion of form-based surveys to a more engaging conversational format. My work identifies and provides evidence for several benefits of the use of conversational interactions in health & behavior change. Among others, I demonstrate the benefits of increased engagement in interaction, improved motivation for performing activities, accessibility benefits related to familiarity, ease of use, comfort with sharing, and an ability to guide the users in the behavior change process via dialogue. I also identify several important challenges: perceptions of artificiality, managing high expectations of contextual knowledge and social intelligence, as well as lower efficiency that could negatively affect the experience for some user groups. I further investigate the concrete links between conversational design elements and these benefits and challenges. My thesis demonstrates various design processes and automation techniques that can lower the effort of designing conversational experiences. As technology progresses, conversational interactions can offer valuable support complementing the existing automated tracking and the efforts of human health coaches. My work offers an important contribution to our understanding of how conversational interactions can play such a beneficial role.

    Stress Analytics in Education

    During the years of college and university education, students are exposed to different kinds of stress, especially during difficult study periods such as final exam weeks or project deadlines. Stress in the long run is dangerous and can contribute to illness through its physiological effects or maladaptive health behaviors. Many students admit, or are self-aware, that they become stressed under different circumstances and have some clues about their potential stressors. Still, even for such students, the monitoring and awareness of stress are not systematic and are based on subjective data, i.e. someone’s feelings. In our work we aim to provide students with the means to become aware of their past, current and expected (objectively measured) stress and its correlation with their performance, to understand their stressors, and to cope with and prevent stress, and thus to live healthier and happier lives and better organize their studies.

    Personalized stress management : enabling stress monitoring with LifelogExplorer

    Stress is one of the major triggers for many diseases. Improving stress balance is therefore an important prevention step. With advances in wearable sensors, it becomes possible to continuously monitor and analyse users’ behavior and arousal in an unobtrusive way. In this paper, we report on a case study in which users (21 teachers of a vocational school) were provided with wearable sensors and could view their arousal information in the context of their life events over a period of four weeks, using our software tool in an unsupervised setting. The goal was to evaluate user engagement and the enabling of self-coaching abilities. Our results show that users actively explored their arousal data during the study. Further qualitative evaluation conducted with 15 of the 21 users indicated that 12 of the 15 were able to learn about their stress patterns based on the information they obtained, but only 5 of them were able to come up with practical interventions for improving their stress balance on their own, while other users were of the opinion that nothing can be done to reduce their stress. This suggests that self-coaching has some potential but that there is a need for further coaching support.

    A trust evaluation framework for sensor readings in body area sensor networks

    This paper presents a framework for evaluating the trustworthiness of a Body Area Sensor Network (BASN), in particular of its sensor readings. We show that such trustworthiness is to be interpreted with respect to a certain statement or goal; its evaluation is based on quality aspects derived from observations and opinions from others. We examine relevant quality aspects of sensor readings which correspond to potential deviating behaviors of sensors. We then look at how to derive such qualities from observations, taking uncertainty as well as decay over time into account in the evaluation. We develop an extension of subjective logic for this purpose and we show how we can compute quality properties without storing long time series. We then demonstrate this for two examples, Galvanic Skin Response (GSR) and Electrocardiography (ECG) sensed data.